ROMANIAN JOURNAL OF INFORMATION 
SCIENCE AND TECHNOLOGY 
Volume 10, Number 2, 2007, 145-156 


Chomsky’s Hierarchy & A Loop-Based 
Taxonomy for Digital Systems 


G. STEFAN 


Politehnica University of Bucharest, Romania 
& BrightScale, Inc., Sunnyvale, CA 
E-mail: gstefan@brightscale.com 


Abstract. Formal languages are supported by digital systems organized in a
corresponding hierarchy. We show that Chomsky’s taxonomy is paralleled by
a loop-based hierarchy for digital systems. The machines used to recognize or
generate a type n formal language belong to a specific class, differentiated from
the one used for a type (n - 1) language (for n = 1, 2, 3). The difference between
them is given by an additional hardware loop closed in the latter. Each new loop
increases the autonomy of the system, making it able to support a more
“expressive” language with a less restrictive grammar. We prove the correspondence
between type 3 languages and 2-loop machines, between type 2 languages and
3-loop machines, and between type 1 languages and 4-loop machines.


1. Introduction 


The taxonomy proposed by Noam Chomsky [2, 3, 4] fits the actual formal
languages perfectly, but it does not suggest a unique guiding principle working
behind this hierarchy. Maybe there is no such unique principle. Or maybe there
are some indirect clues to be found in the way the associated machines are
structured. Consequently, we intend to provide an indirect solution, as a
first step toward finding the supposed guiding principle. The guiding principle
working for the associated machines will provide an indirect answer to our legitimate
question: “what is the unique criterion able to explain the difference between a type
n language and a type (n - 1) language (for n = 1, 2, 3)?”

The well-known correspondence between the language type (regular, context-free,
context-sensitive) and the associated machine (finite automata, push-down automata,
linear-bounded automata) helps us to move the search for the guiding principle from
the domain of languages into the co-domain of machines.

The next section is about how a digital system grows in size or gains new features.
A taxonomy of digital systems, based on the featuring mechanism, is presented. The
third section shows the direct correspondence between the language type and the
minimum number of loops nested inside each other in the associated machines.
The fourth part contains comments on the number of loops, on machine complexity,
and on language efficiency. A concluding section ends this paper.


2. Growing by Compositions & Featuring by Loops 


A digital system computes partial recursive functions [6]: $f : \{0,1\}^n \to \{0,1\}^m$.
It uses basic functions (initialization, increment, selection) and applies rules
(composition, primitive recursiveness, partial recursiveness). For the basic functions,
linear-size and poly-log-time combinational solutions are provided. There are two
mechanisms governing the structuring process in digital systems: growing the size¹ and
adding features. The first is driven by the composition rule and the second by both
primitive and partial recursiveness.

In Figure 1 we present the physical structure associated with the composition
defined as:

$f(x) = g(h_0(x), \ldots, h_{m-1}(x))$

Fig. 1. Implementing the composition rule. The composition of the
function $g$ with the functions $h_0, \ldots, h_{m-1}$ yields a two-level system. The
result is a two-dimensional expansion rule in digital systems: the functions
$h_0, \ldots, h_{m-1}$ are expanded in parallel, and the function $g$ represents a
serial expansion.
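The two-level structure described by the composition rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `compose` and the one-bit functions `h_and`, `h_or`, and `g_and_not` are hypothetical and do not come from the paper.

```python
from typing import Callable, Sequence, Tuple

Bits = Tuple[int, ...]


def compose(g: Callable[[Bits], int],
            hs: Sequence[Callable[[Bits], int]]) -> Callable[[Bits], int]:
    """Return f with f(x) = g(h_0(x), ..., h_{m-1}(x))."""
    def f(x: Bits) -> int:
        # First level (parallel expansion): every h_i receives the same input x.
        level1 = tuple(h(x) for h in hs)
        # Second level (serial expansion): g consumes the m intermediate results.
        return g(level1)
    return f


# Hypothetical one-bit functions, chosen only for illustration.
def h_and(x: Bits) -> int:
    return x[0] & x[1]


def h_or(x: Bits) -> int:
    return x[0] | x[1]


def g_and_not(y: Bits) -> int:
    # "OR is true but AND is not": y = (and_result, or_result)
    return (1 - y[0]) & y[1]


# A 2-input XOR obtained purely by composition, with no loop involved.
xor2 = compose(g_and_not, [h_and, h_or])
```

Composition alone only makes the circuit larger (wider through the h functions, deeper through g); no new kind of behavior appears, which is the point the text makes before introducing loops.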


¹ We consider the size of a machine (circuit) as distinct from its complexity. Size is
expressed as the number of elementary components (2-input gates, for example) used to
build the system. Complexity refers to the minimal size of the description (number of
symbols, for example) used to specify a system. This definition of complexity is inspired
by Chaitin’s algorithmic complexity [1]. A complex system has its size in the same order
of magnitude as its complexity ($S_{\text{complex system}} \sim C_{\text{complex system}}$).
See details in [9], [10].

New kinds of functionality occur only when recursiveness is involved. Primitive
recursiveness introduces the data loop, and partial recursiveness asks for the control loop.
Data loops optimize the size of the circuit, while the control loop is mandatory for
“directing” the computational process.

The operation of closing loops adds new features to a digital system. The no-loop,
combinational families of circuits are able to solve all aspects of computation if no
actual solutions are considered. For real and optimal solutions, however, successive
loops are closed, generating a circuit hierarchy. Thus, the loop-based taxonomy of digital
systems (see [9] for details) generates the following open list of systems:

Fig. 2. Examples of circuits belonging to different orders. A combinational,
no-loop circuit is an O0S. A memory, one-loop circuit is an O1S. Because
the register belongs to O1S, closing a loop containing a register and a combinational
circuit (which is an O0S) results in an automaton: a circuit in O2S. Two
loop-connected automata (a circuit in O3S) work as a processor. An example
of O4S is a simple computer obtained by loop-connecting a processor with a
memory. The cellular automaton contains a number of loops related to the
number of automata it contains.


0-loop combinational systems, called order 0 systems (O0S), include all kinds of
no-loop gate networks (adders, multiplexors, decoders, ...)

1-loop memory systems, called order 1 systems (O1S), include all circuits with
one level of loops closed inside them (latches, clocked latches, random access
memories, file registers, master-slave flip-flops, registers, ...)

2-loop automata systems, called order 2 systems (O2S), include systems with
two levels of loops closed inside, one providing the memory function and
the second allowing autonomous behavior (JK flip-flops, T flip-flops, counters,
automata, finite automata, file registers with ALU, ...)

3-loop processing systems, called order 3 systems (O3S), include systems with the
third loop closed over systems already containing 2 loops (counter automata,
stack automata, simple RISC processors, ...)

4-loop computing systems, called order 4 systems (O4S), include systems with
the fourth loop closed over an O3S (microprogrammed processors,
microcontrollers, simple computers, ...)

n-loop self-organizing systems, called order n systems (OnS), include systems
with the number of internal loops in the same order of magnitude as the size of
the system (cellular automata, complex neural networks, ...).


Although families of O0Ss have the competence to solve any computation, loops
are used to provide new features that are very helpful in obtaining effective solutions.
Each new loop adds a certain degree of autonomy, which improves the “ability” of the
system to solve more complex problems efficiently. In this paper we prove that
each new loop is associated with the removal of a restriction applied to the set of
productions defining a generative grammar.


3. Grammar Type & Number of Loops 


In this main section we aim to prove the consistency of the featuring mechanism
in digital systems (briefly presented in Section 2) with another hierarchy
emphasized in a related domain: Chomsky’s formal language theory. Something important
happens when a new loop is added to a digital system, because this is the way
to move from a machine associated with one type of formal language toward the
machine associated with the next, more “expressive” language in the hierarchy. Let us
examine this strange effect of the correlation between the machine’s autonomy and
the expressiveness of the language.

A short review of Chomsky’s hierarchy of grammars is helpful. A grammar is
defined by the following finite sets of symbols: non-terminals $N = \{S, A, B, \ldots\}$,
where $S$ is the special starting symbol, terminals $T = \{a, b, \ldots\}$, and productions, $P$,
having the form $L \to R$, where $L, R \in (N \cup T)^*$.


Definition 1. According to Chomsky’s taxonomy there are the following types of
generative grammars, introduced by adding in each step a new restriction on the way
the productions are defined. The result is:

type 0 grammars generate recursively enumerable languages, $\mathcal{L}_0$, using rules with
the restriction: L = any string with at least one non-terminal

type 1 grammars generate context-sensitive languages, $\mathcal{L}_1$, using rules with the
additional restriction: R = any string no shorter than L (in the generation
process the string cannot be shortened)

type 2 grammars generate context-free languages, $\mathcal{L}_2$, using rules with the
supplementary restriction: L = one symbol (which must be, according to the first
restriction, a non-terminal)

type 3 grammars generate regular languages, $\mathcal{L}_3$, using rules with the
supplementary restriction: $R = a \mid xY$, where $a, x \in T$ and $Y \in N$ (the terminals are
generated segregated from the non-terminals, or the stream of symbols grows
at one end only).


Each new restriction applied to the way productions are defined reduces the “ex- 
pressivity” of the associated language. 
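The cumulative restrictions of Definition 1 can be checked mechanically. The following Python sketch classifies a set of productions by the most restrictive type it satisfies. It assumes that every grammar symbol is a single character and that productions are given as (L, R) pairs of strings; the function names and the sample grammars are illustrative assumptions, not part of the paper.

```python
def production_type(left, right, nonterminals):
    """Most restrictive type (3, 2, 1 or 0) satisfied by the rule left -> right.

    Returns None when even the type 0 restriction is violated.
    """
    # Type 0: L must contain at least one non-terminal.
    if not any(symbol in nonterminals for symbol in left):
        return None
    # Type 1: R must be no shorter than L.
    if len(right) < len(left):
        return 0
    # Type 2: L must be a single symbol (necessarily a non-terminal).
    if len(left) != 1:
        return 1
    # Type 3: R = a | xY with a, x terminals and Y a non-terminal.
    if len(right) == 1 and right[0] not in nonterminals:
        return 3
    if (len(right) == 2 and right[0] not in nonterminals
            and right[1] in nonterminals):
        return 3
    return 2


def grammar_type(productions, nonterminals):
    """A grammar is as restricted as its least restricted production."""
    types = [production_type(l, r, nonterminals) for l, r in productions]
    if None in types:
        raise ValueError("a production has no non-terminal on its left side")
    return min(types)


# Sample grammars (assumed for illustration).
regular = [("S", "aS"), ("S", "b")]                      # a*b
context_free = [("S", "aSb"), ("S", "ab")]               # a^n b^n
context_sensitive = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
                     ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
unrestricted = [("S", "aAb"), ("aA", "a")]               # a shortening rule
```

Because each type adds one test on top of the previous ones, removing a restriction corresponds to dropping one early return in `production_type`.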

There is also a very well-known correspondence between formal languages and the
associated machines. All these machines are finite automata plus some sort of memory.
The richer the memory, the less restrictive the associated grammar.

- $\mathcal{L}_3 \leftrightarrow$ Finite Automata (FA): a machine with cycle-level memory (it stores only
what can be stored in one finite word called state)

- $\mathcal{L}_2 \leftrightarrow$ Stack Automata (SA): a finite automaton with stack memory, a
destructive-read memory (the stack memory stores a stream of data, but it
“forgets” the information when it delivers it)

- $\mathcal{L}_1 \leftrightarrow$ Linear Bounded Memory Automata (LBA): a finite automaton with finite
non-destructive read memory (or an SA with an additional stack to save the
data read from the main stack)

- $\mathcal{L}_0 \leftrightarrow$ Turing Machine (TM): a finite automaton with infinite non-destructive
read memory.


Each actual machine is defined by, and has its complexity given by, a finite Boolean
table defining the transition function of the associated finite automaton. The resulting
function is implemented as a combinational circuit. The rest consists of different
simple memory functions: registers, stacks, queues (in Post’s definition), list memory
(in Turing’s definition), random access memory (in other equivalent definitions), and so on.
Let us see how we can go from FA to LBA only by closing appropriate loops over simple
subsystems.


3.1. Type-3 Grammars & 2-Loop Machines 


The goal of this subsection is to prove that an optimal digital system must have at
least two internal loops for recognizing or generating the regular (type 3) languages.

In textbooks on formal languages it is stated that each type 3 language can be
recognized/generated by a deterministic initial automaton with final states (see, for
example, [5]). Indeed, according to the way the productions are defined, they can be
reduced to the following two forms: $A \to aB$ and $A \to a$. At each generating step,
one terminal ($a$) is added to the string, and the possibility of a new similar action
is preserved (for $A \to aB$) or not (for $A \to a$). Therefore, each symbol in a well-formed
string depends only on its predecessor. The value of the last received/generated
symbol can be “memorized” by the internal state of a finite automaton. Each state
of the automaton can say something only about the last processed symbol, and that
is enough. Because the sets $N$, $T$, and $P$ are finite, the automaton that recognizes
any language $L(G) \in \mathcal{L}_3$ can be defined as a finite machine.


Theorem 1. The lowest order of a system that implements any finite automaton 
is two. 


Proof. The kernel of a finite automaton is the associated half-automaton, defined
by the following triplet:

$(X, Q, f)$

where $X$ is the finite input set, $Q$ is the finite set of states, and $f : X \times Q \to Q$ is
the state transition function. The general structure of a half-automaton is presented
in Figure 3, where:

- CLC is a combinational logic circuit that computes the transition function $f$;

- Slave Latch is a collection of one-bit latches that store the current state;

- Master Latch is another latch that allows the loop over the entire system to be
closed properly (it has only an electrical function, allowing the synchronous behavior
of the system).

Fig. 3. The internal structure of a half-automaton. The
second loop is closed through two simple latches (containing the
first level of loops) and a complex combinational logic circuit.

In this system there are two levels of loops:

- the first loop level, in each one-bit latch, allows the storing function;

- the second loop level is imposed by the state transition function $f$, which is
defined on $X \times Q$ with values in $Q$ (the current value of the variable $Q$
determines the next value of the same variable). □


In the context of regular language processing we can say that the two levels of
loops are necessary and sufficient to manage regular languages because: (1) the first
loop is used to build the circuit that stores the last received or generated symbol: the
slave latch from the state register; (2) the second loop, closed through the register and
the combinational circuit, is for sequencing the process of recognition/generation. No
more memory is needed because the productions are very simple. The finite automata
are the simplest and the smallest digital machines that recognize and generate regular
strings. We can define a more structured optimal machine, but never a less structured
optimal one having the order 1 or 0.
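The role of the second loop can be illustrated by a small software model of the half-automaton $(X, Q, f)$. The sketch below is an assumption-laden illustration: the class name, the state names, and the example language a*b (generated by S → aS | b) are chosen here and are not taken from the paper. The lookup table plays the part of the CLC and the attribute `state` plays the part of the state register.

```python
class FiniteAutomaton:
    def __init__(self, f, initial, finals):
        self.f = f                # CLC: table for f : X x Q -> Q
        self.initial = initial
        self.finals = finals
        self.state = initial      # state register (built from first-level loops)

    def step(self, x):
        # Second loop: the current state is fed back into the transition
        # function, so the present value of Q determines its next value.
        self.state = self.f[(x, self.state)]

    def accepts(self, string):
        self.state = self.initial
        for x in string:
            self.step(x)
        return self.state in self.finals


# Transition table for the assumed example language a*b.
f = {
    ("a", "q0"): "q0", ("b", "q0"): "q1",
    ("a", "q1"): "dead", ("b", "q1"): "dead",
    ("a", "dead"): "dead", ("b", "dead"): "dead",
}
fa = FiniteAutomaton(f, initial="q0", finals={"q1"})
```

The only memory in this model is the single value held in `state`, which matches the claim that no more memory is needed for regular strings.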


3.2. Type-2 Grammars & 3-Loop Machines 


We expect that the step toward the more expressive languages belonging
to $\mathcal{L}_2$ should require a more autonomous machine to recognize or to generate them.
Can automata work at the level of context-free languages? Yes, they can, but not
as finite automata, because only “infinite” automata can do the job. If we do not
accept “infinite” automata, then a third order system must be used. If we wish
to use an “infinite” automaton to recognize strings belonging to a type 2
language, then an automaton having $|Q| \in O(n)$ must be used, where $n$ is the length
of the string and $|Q|$ is the dimension of $Q$. But our aim is to keep the complexity
at the lowest possible level. In this respect we must find a circuit solution whose
complexity is constant, i.e., independent of $n$.

Let us start with a short discussion of the classical example offered by the
language $\{a^n b^n \mid n > 0\} \in \mathcal{L}_2$. If we want to recognize this language using an
automaton, then the problem is to know how many a’s are received before the first
occurrence of b. The machine must memorize the number of a’s somewhere. For an
automaton this place is its “state space”, but in this case the automaton becomes
an “infinite” machine. The solution is an additional memory.

Going back to textbooks, we find that type 2 languages can be
recognized/generated by pushdown automata (PDA).

A simple remark: the PDA is an efficient machine because the automaton is a complex
finite machine and the stack is an infinite but simple, recursively defined machine.
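A minimal sketch of this division of labor for the example language $\{a^n b^n \mid n > 0\}$ is given below. The function name and the state names are assumptions made for illustration. The control part has a fixed number of states whatever the input length, while the stack grows with the number of a’s received.

```python
def accepts_anbn(string):
    """Recognize {a^n b^n | n > 0} with a fixed controller plus a stack."""
    stack = []                    # simple, unbounded memory
    state = "reading_a"           # small, fixed control part
    for symbol in string:
        if state == "reading_a" and symbol == "a":
            stack.append("a")     # memorize one more a
        elif symbol == "b" and stack:
            stack.pop()           # each b cancels one stored a
            state = "reading_b"
        else:
            return False          # wrong order, unknown symbol, or too many b's
    # Accept only if at least one b was read and every a was cancelled.
    return state == "reading_b" and not stack
```

The count of a’s is never held in the controller state, which is the reason the controller description does not depend on n.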


Theorem 2. The lowest order of a system that implements a push-down automa- 
ton is three. 


Proof. Because the push-down automaton is built using a finite automaton
loop-coupled with a push-down stack, it is a third order system. Indeed, a finite automaton
is a second order system and the push-down stack has the same order because it is
a simple, recursively defined “infinite” automaton. It is an automaton because its
simplest implementation (see Figure 4) needs a register loop-connected with a
multiplexor. If the content of the stack (stored in Register, see Figure 4) is
$\{s_0, s_1, s_2, \ldots\}$, then data-out $= s_0$ and, with data-in $= x$, the multiplexor
selects to the input of Register:

- $\{x, s_0, s_1, \ldots\}$ for {push, pop} = {1, 0};
- $\{s_1, s_2, \ldots\}$ for {push, pop} = {0, 1};
- $\{s_0, s_1, s_2, \ldots\}$ for {push, pop} = {0, 0}. □

Fig. 4. Push-Down Automata. The stack belongs to O2S because
it is a simple, recursively defined, “infinite” automaton (a register (O1S)
loop-connected with a multiplexor (O0S)). As a result, the PDA is an O3S.
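The register-plus-multiplexor view of the stack used in the proof can be modeled directly. The sketch below is a behavioral illustration only; the class name `MuxStack` and the method name `clock` are assumptions. The case {push, pop} = {1, 1} is not described in the text, so the model rejects it.

```python
class MuxStack:
    """Stack modeled as a register whose next content a multiplexor selects."""

    def __init__(self):
        self.register = ()        # (s0, s1, s2, ...), with s0 on top

    @property
    def data_out(self):
        # data-out = s0 (None when the register is empty)
        return self.register[0] if self.register else None

    def clock(self, push, pop, data_in=None):
        s = self.register
        if (push, pop) == (1, 0):
            selected = (data_in,) + s     # {x, s0, s1, ...}
        elif (push, pop) == (0, 1):
            selected = s[1:]              # {s1, s2, ...}
        elif (push, pop) == (0, 0):
            selected = s                  # {s0, s1, s2, ...}: hold
        else:
            raise ValueError("push = pop = 1 is not defined in the text")
        # Closing the loop: the selected value becomes the register content.
        self.register = selected
```

The description has the same size for any stack depth, which is the sense in which the stack is big but simple.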


The stack is the simplest memory device because: (1) it stores only one string; (2)
the string is accessed only at one of its ends (last-in-first-out); (3) the read operation
is destructive (the memory forgets the information read because of the access type).
This simplicity is the reason for using this memory to build the first machine more
complex than a finite automaton. But the same simplicity is the reason why
we must give up this memory if we want to approach the next type of languages.
For the context-sensitive languages we need a memory in which we can access the
same stored content many times.


3.3. Type-1 Grammars & 4-Loop Machines 


When we try to build a machine associated with the language $\{a^n b^n c^n \mid n > 0\} \in \mathcal{L}_1$
we must add a supplementary memory resource in order to deal with the c’s. This
leads us toward the third loop.

In the general case, we can use a finitely defined machine for languages belonging
to $\mathcal{L}_1$ only by adding a new push-down stack to the pushdown stack automaton
[7] (see Figure 5). In this case a new loop is closed in the machine and it becomes
a fourth order system. The new stack is used to compensate for the limitation of the
stack memory, which forgets what it reads. Thus, the simplest automaton with an
associated non-destructive memory is made by adding a new pushdown stack to a
PDA. For each POP0 from STACK 0, a PUSH1 with the read symbol is performed
in STACK 1. For each POP1 from STACK 1, a corresponding PUSH0 can be made
in STACK 0. The size of each stack can be linearly bounded by the string length.


[Figure 5 labels: the second loop, the third loop, the fourth loop.]

Fig. 5. PDA with stack memory. FA & STACK 0 = PDA.
Adding STACK 1, a fourth loop is closed. Any symbol read from
STACK 0 can be inserted in STACK 1 and vice versa. STACK 0
& STACK 1 behave like a list memory.
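The way two destructive-read stacks combine into a non-destructive list memory can be sketched as follows. The class and method names are assumptions made for illustration; only the POP0/PUSH1 and POP1/PUSH0 pairing comes from the text.

```python
class TwoStackList:
    """List memory built from two stacks; every read is a pop plus a push."""

    def __init__(self, symbols):
        # STACK 0 holds the string with its first symbol on top.
        self.stack0 = list(reversed(symbols))
        self.stack1 = []          # STACK 1 starts empty

    def read_forward(self):
        symbol = self.stack0.pop()      # POP0: destructive read
        self.stack1.append(symbol)      # PUSH1: the symbol is saved
        return symbol

    def read_backward(self):
        symbol = self.stack1.pop()      # POP1
        self.stack0.append(symbol)      # PUSH0: the symbol is restored
        return symbol


memory = TwoStackList("abc")
first_pass = [memory.read_forward() for _ in range(3)]
rewind = [memory.read_backward() for _ in range(3)]
second_pass = [memory.read_forward() for _ in range(3)]
```

Nothing is lost by reading: the total number of stored symbols stays constant, so the same content can be scanned as many times as needed.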


The fourth loop of the system is necessary because it gives us access to a new
external memory. This additional memory was imposed because another restriction
has been removed when $\mathcal{L}_1$ enters the scene.

It is known that context-sensitive languages are recognized/generated by linear 
bounded automata. 


Theorem 3. The lowest order of a system that implements a linear bounded 
memory automaton is four. 


Proof. The simplest memory having non-destructive reading can be made by
loop-connecting two push-down stack memories. Because a push-down stack is a second
order system, a memory with non-destructive reading is a third order system and the
resulting system is a fourth order system (see Figure 5). An equivalent structure is
presented in Figure 6, where:

- FA is a finite automaton (a second order system)

- U/D COUNTER (another second order system) is an “infinite” automaton
with a simple structure (its complexity is $C_{U/D\ COUNTER} \in O(1)$, even if its
size is $S_{U/D\ COUNTER} \in O(\log n)$), used to point to a symbol in memory

- RAM is a random access memory (a first order system) used to store the string
(its complexity is $C_{RAM} \in O(1)$, even if its size is $S_{RAM} \in O(n)$)

The resulting structure also has two loops closed over a finite automaton. □

The hardware requirement for context-sensitive languages implies a more structured
and more functionally segregated machine. This machine has two supplementary
loops added to an automaton, with two distinct roles: (1) the first, through
RAM, in order to access an external memory support; (2) the second, through
U/D COUNTER and RAM, in order to access an external memory function: a
bi-directionally scanned list.

The list can support a more complex computation than the stack. Both are strings,
but the stack allows only limited and destructive access to the content of the string.
In a memory hierarchy the list has a higher “order” because it is implemented by two
loop-connected stacks.


[Figure 6 labels: FA, U/D COUNTER with commands {UP, DOWN, -}, RAM with ADR and DOUT;
the second loop, the third loop, the fourth loop.]

Fig. 6. Automaton with Linear Bounded Memory. FA uses
the loop-connected RAM as a storage device. The fourth loop, closed
through the counter, organizes the RAM’s content as a list.
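The structure of Figure 6 can be exercised on the example language $\{a^n b^n c^n \mid n > 0\}$. The sketch below is one possible recognizer under stated assumptions: the marking strategy (crossing out one a, one b, and one c per sweep), the marker symbol "#", and all names are chosen here for illustration and are not prescribed by the paper. The list `ram` stands for the RAM, the integer `addr` for the U/D COUNTER, and the remaining logic for the finite automaton.

```python
def accepts_anbncn(string):
    """Recognize {a^n b^n c^n | n > 0} with memory bounded by the input length."""
    ram = list(string)            # RAM: stores the string, size O(n)
    n = len(ram)
    addr = 0                      # U/D COUNTER: address of the scanned cell

    # Sweep 0: finite-state check that the string has the shape a+ b+ c+.
    phase = -1
    while addr < n:
        idx = "abc".find(ram[addr])
        if idx < 0 or idx not in (phase, phase + 1):
            return False
        phase = idx
        addr += 1                 # UP
    if phase != 2:
        return False

    # Marking sweeps: each one crosses out the first a, b and c still present.
    while True:
        while addr > 0:           # DOWN: rewind the counter to address 0
            addr -= 1
        crossed = set()
        while addr < n:           # UP: scan the whole list once
            symbol = ram[addr]
            if symbol != "#" and symbol not in crossed:
                ram[addr] = "#"   # non-destructive access lets us rewrite in place
                crossed.add(symbol)
            addr += 1
        if not crossed:
            return True           # all three letters ran out together
        if len(crossed) < 3:
            return False          # one letter ran out before the others
```

The same stored content is visited many times and in both directions, which is exactly the access pattern a single stack cannot offer.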


4. Loops & Complexity 


An automaton dealing with an $\mathcal{L}_2$ language has the size of its state space in
$O(n)$, where $n$ is the maximum length of the string to be processed. Accordingly,
the complexity of the associated combinational circuit becomes unmanageable. The
problem is solved by introducing a new concept and a new machine: the PDA. What did
we do by substituting a PDA for an “infinite” automaton? We segregated the “hidden”
simple part of the “infinite” automaton from its inherently complex part. The stack
is a big but simple subsystem, and the remaining finite automaton is a small but
complex subsystem. Because they are loop-connected, the resulting machine exerts an
increased autonomy without a significantly increased complexity.
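The growth of the state space can be made concrete with a small experiment. The sketch below builds the transition table of a finite automaton for the bounded language $\{a^k b^k \mid 0 < k \le n\}$; the construction, the state encoding, and the function names are assumptions made for illustration. Inputs are assumed to contain only the symbols a and b.

```python
def bounded_anbn_table(n):
    """Transition table of a finite automaton for {a^k b^k | 0 < k <= n}."""
    dead = "dead"
    table = {}
    for i in range(n + 1):
        # State ("a", i): i a's have been read and no b yet.
        table[("a", i), "a"] = ("a", i + 1) if i < n else dead
        table[("a", i), "b"] = ("b", i - 1) if i > 0 else dead
    for j in range(n):
        # State ("b", j): j more b's are still expected.
        table[("b", j), "a"] = dead
        table[("b", j), "b"] = ("b", j - 1) if j > 0 else dead
    table[dead, "a"] = dead
    table[dead, "b"] = dead
    return table


def run(table, string):
    state = ("a", 0)
    for symbol in string:
        state = table[state, symbol]
    return state == ("b", 0)      # accept when no b is missing


# Number of table entries for a few bounds: it grows linearly with n.
sizes = {n: len(bounded_anbn_table(n)) for n in (1, 10, 100)}
```

With this encoding the table has 4n + 4 entries, so its description grows with the longest string to be handled, whereas a stack-based controller for the same language keeps a fixed description.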

Thus, by adding a new loop, the big and simple part (the stack) and the small and
complex part (the finite automaton) are very clearly emphasized. The result is an optimal
structure, where the complexity is minimized and the big parts are simple.

It looks as if closing a new loop helps to increase the complexity of
processing while maintaining the complexity of the machine at a similar, low level. Indeed,
the only complex structure in a PDA or an LBA is the finite automaton, more precisely
its combinational circuit. All the other components involved in defining different
machines (registers, multiplexors, counters, RAMs) are simple because they have
constant-sized, recursive definitions. The size of their definitions does not depend on
the dimensions of the sets $N$, $T$, and $P$, or on the maximum length of the
recognized or generated strings of symbols. Fortunately, the definition’s size for the finite
automaton depends only on the finite sizes of the sets $N$, $T$, and $P$.

The list of meaningful machines ends with the Turing Machine (TM). It can be
represented as the LBA with one very important amendment: the dimensions of both the
counter and the memory are considered infinite. The memory must be infinite
because the restriction “R = any string no shorter than L” no longer acts;
thus, the intermediate length of a string during its generation cannot be predicted,
and consequently it can be bigger than that of the final form.

The path from the families of combinational circuits toward the TM has its
important milestones marked each time by a new level of loops. The autonomy of the
resulting systems increases with each additional loop. This process culminates in the
possibility of building a 0-State Universal TM: the simplest possible computational
structure (see [10]). In a 0-State Universal TM the segregation between the complex
part and the simple part is maximal. Indeed, the complex part, the programs stored in
memory, is completely segregated from the simple part, the hardware designed using
only simple, recursively defined structures.

The first consequence of closing loops is the possibility to segregate simplicity 
from complexity, thus reducing the actual complexity of the machines involved in 
computation. 

The second consequence, related to the first, is the increased autonomous
behavior of the system when it gains a new loop. The autonomy of the machines evolves
in parallel with the “expressivity” of the associated languages.


5. Concluding Remarks 


1. The actual structure of machines associated with formal languages tends to min- 
imal complexity. In this respect we can state the following proposition: Digital ma- 
chines that recognize and generate formal languages can be “infinite” (big sized) ma- 
chines, but they must have finite definitions (small complexity). 


2. The most important conclusion of this paper is that there exists a direct corre- 
spondence between: 


- $\mathcal{L}_3 \leftrightarrow$ O2S
- $\mathcal{L}_2 \leftrightarrow$ O3S
- $\mathcal{L}_1 \leftrightarrow$ O4S

Can the unique principle governing the loop-based taxonomy be mirrored by a similar
unique principle in the language hierarchy? This remains, for the time being, an open
question.

The TM does not have a distinct order associated with it in the presented structural
hierarchy of digital systems, because it is only a purely theoretical model.

Between the context-sensitive, type 1 languages and the type 0 languages there are
many other types of languages, corresponding to less restricted productions defining
their grammars. These languages are outside our interest because the majority of
programming languages are context-free. Systems of order higher than four
are associated with these hypothetical languages.


3. The 0-state, control free Universal TM shows us that the evolution of the com- 
puting machines converted the hardware complexity into the software complexity. 
Nowadays VLSI technologies can build big sized circuits only if they are simple. The 
structural complexity cannot grow with the same speed as the size. We must avoid 
growing the complexity in order to preserve our ability to build very large circuits. 
Fortunately, the loops closed inside the digital systems provide a lot of help in this 
respect. 